Automatic Construction of Movie Domain Korean Sentiment Dictionary Using Online Movie Reviews
Abstract
We present a method for automatically constructing a domain-specific Korean sentiment dictionary that can be used to classify the sentiment of online movie reviews. More than 1.18 million online movie reviews with ratings of 1 to 4 or 7 to 10 were collected across fourteen movie genres to calculate the joint probability of a given word and the sentiment of movie reviews for each genre. In particular, two joint probabilities were calculated for each genre: (1) that of a given word and the positive movie reviews (ratings 7 to 10), and (2) that of a given word and the negative movie reviews (ratings 1 to 4). The difference between the two joint probabilities (i.e., (1) - (2)) was obtained for each word in each genre, and each word's differences across the fourteen genres were averaged. Finally, the averaged joint probability differences were normalized to range between -1 and 1. These normalized values serve as the sentiment values of the words in the final 135,082-word movie domain Korean sentiment dictionary. The positive/negative binary sentiment classification performance of the constructed dictionary was evaluated on test data, and a balanced accuracy of 80.7% was achieved, confirming the effectiveness of the proposed construction method.
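The construction steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `build_sentiment_dictionary`, the input format (pre-tokenized reviews), the estimate of each joint probability as a review-count fraction within a genre, the treatment of a word absent from a genre as contributing a difference of zero, and the normalization by the largest absolute value are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_sentiment_dictionary(reviews):
    """Build a word -> sentiment value mapping in [-1, 1].

    `reviews` is an iterable of (genre, rating, tokens) tuples.
    Tokenization is assumed to have been done elsewhere.
    """
    # genre -> word -> number of positive / negative reviews containing the word
    pos_counts = defaultdict(Counter)
    neg_counts = defaultdict(Counter)
    totals = Counter()  # genre -> number of reviews actually used

    for genre, rating, tokens in reviews:
        if 7 <= rating <= 10:
            target = pos_counts
        elif 1 <= rating <= 4:
            target = neg_counts
        else:
            continue  # ratings outside 1-4 and 7-10 are not used
        totals[genre] += 1
        target[genre].update(set(tokens))  # count each word once per review

    genres = list(totals)
    vocab = set()
    for genre in genres:
        vocab.update(pos_counts[genre])
        vocab.update(neg_counts[genre])

    # Assumed estimate: P(word, sentiment | genre) = matching reviews / reviews in genre.
    # The per-genre difference (1) - (2) is then averaged over all genres.
    averaged = {}
    for word in vocab:
        diffs = [
            (pos_counts[g][word] - neg_counts[g][word]) / totals[g]
            for g in genres
        ]
        averaged[word] = sum(diffs) / len(diffs)

    # Assumed normalization: divide by the largest absolute value.
    scale = max((abs(v) for v in averaged.values()), default=0.0)
    if scale == 0.0:
        return {word: 0.0 for word in averaged}
    return {word: value / scale for word, value in averaged.items()}


if __name__ == "__main__":
    toy_reviews = [
        ("drama", 9, ["good", "movie"]),
        ("drama", 2, ["bad", "movie"]),
        ("action", 10, ["good"]),
        ("action", 1, ["bad", "bad"]),
        ("action", 5, ["good"]),  # skipped: rating not in 1-4 or 7-10
    ]
    print(build_sentiment_dictionary(toy_reviews))
```

In this toy input, a word that appears equally often in positive and negative reviews ends up with a value of zero, while words seen only on one side are pushed toward the ends of the range.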
Similar resources
A Supervised Method for Constructing Sentiment Lexicon in Persian Language
Due to the increasing growth of digital content on the internet and social media, sentiment analysis has become one of the emerging fields. This problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...
YouTube Movie Reviews: In, Cross, and Open-domain Sentiment Analysis in an Audiovisual Context
In this contribution we focus on the task of automatically analyzing a speaker’s sentiment in on-line videos containing movie reviews. In addition to textual information, we consider adding audio features as typically used in speech-based emotion recognition as well as video features encoding valuable valence information conveyed by the speaker. We combine this multi-modal experimental setup wi...
Machine Learning-based Sentiment Analysis of Automatic Indonesian Translations of English Movie Reviews
Sentiment analysis is the automatic classification of the overall opinion conveyed by a text towards its subject matter. This paper discusses an experiment in the sentiment analysis of a collection of movie reviews that have been automatically translated to Indonesian. Following [1], we employ three well known classification techniques: naive Bayes, maximum entropy, and support vector machin...
Sentiment Classification and Feature based Summarization of Movie Reviews in Mobile Environment
A new framework is designed for a sentiment classification and feature based summarization system in a mobile environment. Posting online reviews has become an increasingly popular way for people to share their opinions about specific products or services with other users. It has become a common practice for web technologies to provide the venues and facilities for people to publish their reviews. ...
Sentiment Analysis of movie reviews using SentiWordNet Approach
In this paper, a new kind of domain-specific, feature-based heuristic for aspect-level sentiment analysis of movie reviews is presented. An unsupervised learning technique is used for sentiment classification. A SentiWordNet-based scheme is applied, using two different linguistic feature selections containing adjectives, adverbs and verbs, together with n-gram feature extraction. In aspect orient...